library(arules)
library(arulesViz)
library(httpuv)
setwd("~/Desktop/Assignments&Coursework/422/Hw-2-March4/")
rm(list=ls())

Read the data directly in as transactions and inspect them.

trans <- read.transactions("mba-2.csv", sep=",")
summary(trans)
transactions as itemMatrix in sparse format with
 24 rows (elements/itemsets/transactions) and
 8 columns (items) and a density of 0.349 

most frequent items:
conditioner        milk       bread        coke        beer     (Other) 
         10          10           9           9           8          21 

element (itemset/transaction) length distribution:
sizes
 1  2  3  4  7 
 2 11  4  6  1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    2.00    2.00    2.79    4.00    7.00 

includes extended item information - examples:
  labels
1   beer
2  bread
3   coke
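
The density and mean transaction length in the summary above can be checked by hand from the printed size distribution. The following is a small base R sketch; the variable names are illustrative and the numbers are copied from the summary output, not recomputed from `mba-2.csv`.

```r
# Transaction sizes and how many transactions have each size (from the summary)
sizes  <- c(1, 2, 3, 4, 7)
counts <- c(2, 11, 4, 6, 1)

n_transactions <- sum(counts)        # 24 rows
n_items        <- 8                  # 8 columns (distinct items)
total_entries  <- sum(sizes * counts) # non-zero cells in the sparse matrix

# Density = filled cells / all cells of the transaction-by-item matrix
density <- total_entries / (n_transactions * n_items)
round(density, 3)                    # 0.349, as reported

# Mean transaction length = filled cells / number of transactions
mean_length <- total_entries / n_transactions
round(mean_length, 2)                # 2.79, as reported
```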

Inspect the first 5 transactions in the database

inspect(trans[1:5])
    items                   
[1] {bread,milk}            
[2] {beer,bread,diaper,eggs}
[3] {beer,coke,diaper,milk} 
[4] {beer,bread,diaper,milk}
[5] {bread,diaper,milk}     

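Get familiar with the data: plot the frequency of items with support of at least 0.1, then view the transaction-by-item matrix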

itemFrequencyPlot(trans, support = 0.1)

image(trans)
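
As a rough check on what the frequency plot shows, the relative frequency of an item is its count divided by the number of transactions, and `support = 0.1` keeps only items at or above that level. A base R sketch using the five item counts printed in the summary (the remaining three items are lumped under "(Other)" there, so they are left out here):

```r
# Item counts copied from the "most frequent items" part of the summary
item_counts <- c(conditioner = 10, milk = 10, bread = 9, coke = 9, beer = 8)
n_transactions <- 24

# Relative item frequency, i.e. the support of each single item
rel_freq <- item_counts / n_transactions
round(rel_freq, 3)

# Items that clear the 0.1 threshold used in the plot
names(rel_freq)[rel_freq >= 0.1]
```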

Now, let’s run Apriori on the dataset. Note that we only get one rule. Why?

rules <- apriori(trans)
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
        0.8    0.1    1 none FALSE            TRUE       5     0.1      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 2 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[8 item(s), 24 transaction(s)] done [0.00s].
sorting and recoding items ... [8 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [1 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
rm(rules)
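
The "Absolute minimum support count: 2" line in the output above can be reproduced by hand. This is a minimal base R sketch; it assumes, based only on the values printed in this notebook and not on the arules source, that the reported count is the relative support multiplied by the number of transactions and truncated to an integer. The helper `abs_support_count` is hypothetical.

```r
# Hypothetical helper: relative support threshold -> truncated transaction count
abs_support_count <- function(support, n_transactions) {
  floor(support * n_transactions)
}

n_transactions <- 24

abs_support_count(0.1, n_transactions)   # 2, matching the default run above
abs_support_count(0.01, n_transactions)  # 0, so a 0.01 threshold excludes nothing here
```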

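We get only one rule because the default minimum support (minsup = 0.1), combined with the default minimum confidence of 0.8, is too strict for this dataset. Let’s reduce the support threshold.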

rules <- apriori(trans, parameter = list(support=0.01))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
        0.8    0.1    1 none FALSE            TRUE       5    0.01      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 98 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
sorting and recoding items ... [88 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [0 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
summary(rules)
set of 0 rules
rules <- apriori(trans, parameter = list(support=0.001))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
        0.8    0.1    1 none FALSE            TRUE       5   0.001      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 9 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
sorting and recoding items ... [157 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 6 done [0.01s].
writing ... [410 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
summary(rules)
set of 410 rules

rule length distribution (lhs + rhs):sizes
  3   4   5   6 
 29 229 140  12 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   3.00    4.00    4.00    4.33    5.00    6.00 

summary of quality measures:
    support          confidence         lift           count     
 Min.   :0.00102   Min.   :0.800   Min.   : 3.13   Min.   :10.0  
 1st Qu.:0.00102   1st Qu.:0.833   1st Qu.: 3.31   1st Qu.:10.0  
 Median :0.00122   Median :0.846   Median : 3.59   Median :12.0  
 Mean   :0.00125   Mean   :0.866   Mean   : 3.95   Mean   :12.3  
 3rd Qu.:0.00132   3rd Qu.:0.909   3rd Qu.: 4.34   3rd Qu.:13.0  
 Max.   :0.00315   Max.   :1.000   Max.   :11.24   Max.   :31.0  

mining info:
  data ntransactions support confidence
 trans          9835   0.001        0.8

Let’s inspect the top six rules, sorted by confidence

inspect(head(rules, by="confidence"))
    lhs                     rhs           support confidence lift count
[1] {diaper,shampoo}     => {eggs}        0.0417  1          4.0  1    
[2] {eggs,shampoo}       => {beer}        0.0833  1          3.0  2    
[3] {beer,shampoo}       => {eggs}        0.0833  1          4.0  2    
[4] {eggs,shampoo}       => {conditioner} 0.0833  1          2.4  2    
[5] {conditioner,eggs}   => {shampoo}     0.0833  1          3.0  2    
[6] {conditioner,diaper} => {eggs}        0.0417  1          4.0  1    
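
The quality measures in this table follow directly from transaction counts. The sketch below recomputes them in base R for rule [2], {eggs,shampoo} => {beer}, using the rule count (2) and the item counts for beer (8) and conditioner (10) printed earlier. The count of transactions containing the left-hand side is not printed; it is inferred here to be 2 because the rule count is 2 and the confidence is 1.

```r
n_transactions <- 24

count_rule <- 2   # transactions containing eggs, shampoo and beer (rule [2])
count_lhs  <- 2   # transactions containing eggs and shampoo (inferred, see above)
count_beer <- 8   # transactions containing beer (from the summary)

support    <- count_rule / n_transactions               # P(lhs and rhs)
confidence <- count_rule / count_lhs                    # P(rhs | lhs)
lift       <- confidence / (count_beer / n_transactions) # confidence / P(rhs)

round(c(support = support, confidence = confidence, lift = lift), 4)
# support 0.0833, confidence 1, lift 3, as in row [2]

# Same left-hand side with conditioner (count 10) as consequent, row [4]
lift_conditioner <- confidence / (10 / n_transactions)
lift_conditioner  # 2.4
```

A lift above 1 means the consequent appears more often with the antecedent than it does overall. Note that several of these rules rest on only one or two transactions.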

We can also plot the rules interactively and examine them.

plot(rules, engine="htmlwidget")

To drill down into rules with a particular consequent (here, beer on the right-hand side), restrict the item appearance as follows:

rules.beer <- apriori(trans, parameter=list(supp=0.01),
                 appearance = list(default="lhs", rhs="beer"))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
        0.8    0.1    1 none FALSE            TRUE       5    0.01      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 0 

set item appearances ...[1 item(s)] done [0.00s].
set transactions ...[8 item(s), 24 transaction(s)] done [0.00s].
sorting and recoding items ... [8 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 6 7 done [0.00s].
writing ... [43 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
inspect(sort(rules.beer, decreasing = TRUE, by="support"))
     lhs                                            rhs    support confidence lift count
[1]  {eggs,shampoo}                              => {beer} 0.0833  1          3    2    
[2]  {conditioner,eggs}                          => {beer} 0.0833  1          3    2    
[3]  {coke,diaper}                               => {beer} 0.0833  1          3    2    
[4]  {conditioner,eggs,shampoo}                  => {beer} 0.0833  1          3    2    
[5]  {coke,diaper,milk}                          => {beer} 0.0833  1          3    2    
[6]  {diaper,shampoo}                            => {beer} 0.0417  1          3    1    
[7]  {conditioner,diaper}                        => {beer} 0.0417  1          3    1    
[8]  {diaper,eggs,shampoo}                       => {beer} 0.0417  1          3    1    
[9]  {coke,eggs,shampoo}                         => {beer} 0.0417  1          3    1    
[10] {eggs,milk,shampoo}                         => {beer} 0.0417  1          3    1    
[11] {coke,diaper,eggs}                          => {beer} 0.0417  1          3    1    
[12] {conditioner,diaper,eggs}                   => {beer} 0.0417  1          3    1    
[13] {coke,conditioner,eggs}                     => {beer} 0.0417  1          3    1    
[14] {coke,eggs,milk}                            => {beer} 0.0417  1          3    1    
[15] {conditioner,eggs,milk}                     => {beer} 0.0417  1          3    1    
[16] {coke,diaper,shampoo}                       => {beer} 0.0417  1          3    1    
[17] {conditioner,diaper,shampoo}                => {beer} 0.0417  1          3    1    
[18] {diaper,milk,shampoo}                       => {beer} 0.0417  1          3    1    
[19] {coke,conditioner,diaper}                   => {beer} 0.0417  1          3    1    
[20] {conditioner,diaper,milk}                   => {beer} 0.0417  1          3    1    
[21] {coke,conditioner,milk}                     => {beer} 0.0417  1          3    1    
[22] {coke,diaper,eggs,shampoo}                  => {beer} 0.0417  1          3    1    
[23] {conditioner,diaper,eggs,shampoo}           => {beer} 0.0417  1          3    1    
[24] {diaper,eggs,milk,shampoo}                  => {beer} 0.0417  1          3    1    
[25] {coke,conditioner,eggs,shampoo}             => {beer} 0.0417  1          3    1    
[26] {coke,eggs,milk,shampoo}                    => {beer} 0.0417  1          3    1    
[27] {conditioner,eggs,milk,shampoo}             => {beer} 0.0417  1          3    1    
[28] {coke,conditioner,diaper,eggs}              => {beer} 0.0417  1          3    1    
[29] {coke,diaper,eggs,milk}                     => {beer} 0.0417  1          3    1    
[30] {conditioner,diaper,eggs,milk}              => {beer} 0.0417  1          3    1    
[31] {coke,conditioner,eggs,milk}                => {beer} 0.0417  1          3    1    
[32] {coke,conditioner,diaper,shampoo}           => {beer} 0.0417  1          3    1    
[33] {coke,diaper,milk,shampoo}                  => {beer} 0.0417  1          3    1    
[34] {conditioner,diaper,milk,shampoo}           => {beer} 0.0417  1          3    1    
[35] {coke,conditioner,milk,shampoo}             => {beer} 0.0417  1          3    1    
[36] {coke,conditioner,diaper,milk}              => {beer} 0.0417  1          3    1    
[37] {coke,conditioner,diaper,eggs,shampoo}      => {beer} 0.0417  1          3    1    
[38] {coke,diaper,eggs,milk,shampoo}             => {beer} 0.0417  1          3    1    
[39] {conditioner,diaper,eggs,milk,shampoo}      => {beer} 0.0417  1          3    1    
[40] {coke,conditioner,eggs,milk,shampoo}        => {beer} 0.0417  1          3    1    
[41] {coke,conditioner,diaper,eggs,milk}         => {beer} 0.0417  1          3    1    
[42] {coke,conditioner,diaper,milk,shampoo}      => {beer} 0.0417  1          3    1    
[43] {coke,conditioner,diaper,eggs,milk,shampoo} => {beer} 0.0417  1          3    1    
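
Many of the 43 rules above only add items to a shorter left-hand side without changing the confidence. One possible way to thin such a list is `is.redundant()` from arules, which flags rules for which a more general rule with equal or higher confidence exists in the set. The sketch below is self-contained and runs on a hypothetical four-basket toy dataset, not on `mba-2.csv`; the thresholds are chosen only for illustration.

```r
library(arules)

# Hypothetical toy baskets (NOT the contents of mba-2.csv)
baskets <- list(
  c("beer", "diaper"),
  c("beer", "diaper", "milk"),
  c("bread", "milk"),
  c("beer", "bread")
)
toy <- as(baskets, "transactions")

# Mine only rules with beer as the consequent, as in the notebook
rules_beer <- apriori(
  toy,
  parameter  = list(support = 0.25, confidence = 0.8),
  appearance = list(default = "lhs", rhs = "beer"),
  control    = list(verbose = FALSE)
)

# Drop rules that a more general rule in the set already covers
pruned <- rules_beer[!is.redundant(rules_beer)]
inspect(pruned)
```

Applied to the notebook's data, the same idiom would be `rules.beer[!is.redundant(rules.beer)]`.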